PAGE: 1 RAW SEQUENCE LISTING DATE: 01/30/97

PATENT APPLICATION US/08/409,122 TIME: 09:39:14

INPUT SET: S15202.raw



This Raw Listing contains the General 
Information Section and up to the first 5 pages. 



1 SEQUENCE LISTING 
2 

3 (1) General Information 
4 

5 (i) APPLICANT: JOYCE, JAMES G.

6 GEORGE, HUGH A.

7 HOFMANN, KATHRYN J.

8 JANSEN, KATHRIN U. 

9 NEEPER, MICHAEL P. 
10 

11 (ii) TITLE OF THE INVENTION: DNA ENCODING HUMAN PAPILLOMAVIRUS TYPE 18 VACCI 
12 

13 (iii) NUMBER OF SEQUENCES: 16 
14 

15 (iv) CORRESPONDENCE ADDRESS: 

16 (A) ADDRESSEE: CHRISTINE E. CARTY - MERCK & CO., INC. 

17 (B) STREET: 126 EAST LINCOLN AVENUE - P.O. BOX 2000 

18 (C) CITY: RAHWAY 

19 (D) STATE: NJ 

20 (E) COUNTRY: US 

21 (F) ZIP: 07065-0907
22 

23 (v) COMPUTER READABLE FORM:

24 (A) MEDIUM TYPE: Diskette 

25 (B) COMPUTER: IBM Compatible 

26 (C) OPERATING SYSTEM: DOS 

27 (D) SOFTWARE: FastSEQ Version 1.5 
28 

29 (vi) CURRENT APPLICATION DATA: 

30 (A) APPLICATION NUMBER: US/08/409,122

31 (B) FILING DATE: 

32 (C) CLASSIFICATION: 
33 

34 (vii) PRIOR APPLICATION DATA: 

35 (A) APPLICATION NUMBER: 08/408,669 

36 (B) FILING DATE: 22-MAR-1995 
37 

38 
39 

40 (viii) ATTORNEY/AGENT INFORMATION: 

41 (A) NAME: CARTY, CHRISTINE E 

42 (B) REGISTRATION NUMBER: 36,099 

43 (C) REFERENCE/DOCKET NUMBER: 19425
44 

45 (ix) TELECOMMUNICATION INFORMATION: 

46 (A) TELEPHONE: 908-594-6734 
(B) TELEFAX: 908-594-4720 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1524 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATGGCTTTGT GGCGGCCTAG TGACAATACC GTATACCTTC CACCTCCTTC TGTGGCAAGA 60 

GTTGTAAATA CTGATGATTA TGTGACTCGC ACAAGCATAT TTTATCATGC TGGCAGCTCT 120 

AGATTATTAA CTGTTGGTAA TCCATATTTT AGGGTTCCTG CAGGTGGTGG CAATAAGCAG 180 

GATATTCCTA AGGTTTCTGC ATACCAATAT AGAGTATTTC GGGTGCAGTT ACCTGACCCA 240 

AATAAATTTG GTTTACCTGA TAATAGTATT TATAATCCTG AAACACAACG TTTAGTGTGG 300 

GCCTGTGCTG GAGTGGAAAT TGGCCGTGGT CAGCCTTTAG GTGTTGGCCT TAGTGGGCAT 360 

CCATTTTATA ATAAATTAGA TGACACTGAA AGTTCCCATG CCGCTACGTC TAATGTTTCT 420 

GAGGACGTTA GGGACAATGT GTCTGTAGAT TATAAGCAGA CACAGTTATG TATTTTGGGC 480 

TGTGCCCCTG CTATTGGGGA ACACTGGGCT AAAGGCACTG CTTGTAAATC GCGTCCTTTA 540 

TCACAGGGCG ATTGCCCCCC TTTAGAACTT AAGAACACAG TTTTGGAAGA TGGTGATATG 600 

GTAGATACTG GATATGGTGC CATGGACTTT AGTACATTGC AAGATACTAA ATGTGAGGTA 660 

CCATTGGATA TTTGTCAGTC TATTTGTAAA TATCCTGATT ATTTACAAAT GTCTGCAGAT 720 

CCTTATGGGG ATTCCATGTT TTTTTGCTTA CGACGTGAGC AGCTTTTTGC TAGGCATTTT 780 

TGGAATAGGG CAGGTACTAT GGGTGACACT GTGCCTCAAT CCTTATATAT TAAAGGCACA 840 

GGTATGCGTG CTTCACCTGG CAGCTGTGTG TATTCTCCCT CTCCAAGTGG CTCTATTGTT 900 

ACCTCTGACT CCCAGTTGTT TAATAAACCA TATTGGTTAC ATAAGGCACA GGGTCATAAC 960 

AATGGTATCT GCTGGCATAA TCAATTATTT GTTACTGTGG TAGATACCAC TCGTAGTACC 1020 

AATTTAACAA TATGTGCTTC TACACAGTCT CCTGTACCTG GGCAATATGA TGCTACCAAA 1080 

TTTAAGCAGT ATAGCAGACA TGTTGAAGAA TATGATTTGC AGTTTATTTT TCAGTTATGT 1140 

ACTATTACTT TAACTGCAGA TGTTATGTCC TATATTCATA GTATGAATAG CAGTATTTTA 1200 

GAGGATTGGA ACTTTGGTGT TCCCCCCCCG CCAACTACTA GTTTGGTGGA TACATATCGT 1260 

TTTGTACAAT CTGTTGCTAT TACCTGTCAA AAGGATGCTG CACCAGCTGA AAATAAGGAT 1320 

CCCTATGATA AGTTAAAGTT TTGGAATGTG GATTTAAAGG AAAAGTTTTC TTTGGACTTA 1380 

GATCAATATC CCCTTGGACG TAAATTTTTG GTTCAGGCTG GATTGCGTCG CAAGCCCACC 1440 

ATAGGCCCTC GTAAACGTTC TGCTCCATCT GCCACTACGT CTTCTAAACC TGCCAAGCGT 1500 

GTGCGTGTAC GTGCCAGGAA GTAA 1524 

(2) INFORMATION FOR SEQ ID NO:2:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 507 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 



100 (D) TOPOLOGY: linear

101 

102 (ii) MOLECULE TYPE: protein 

103 (iii) HYPOTHETICAL: NO

104 (iv) ANTI-SENSE: NO 

105 (v) FRAGMENT TYPE: N-terminal 

106 (vi) ORIGINAL SOURCE: 
107 

108 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
109 
153 340 345 350 

154 Pro Gly Gln Tyr Asp Ala Thr Lys Phe Lys Gln Tyr Ser Arg His Val

155 355 360 365

156 Glu Glu Tyr Asp Leu Gln Phe Ile Phe Gln Leu Cys Thr Ile Thr Leu

157 370 375 380

158 Thr Ala Asp Val Met Ser Tyr Ile His Ser Met Asn Ser Ser Ile Leu

159 385 390 395 400

160 Glu Asp Trp Asn Phe Gly Val Pro Pro Pro Pro Thr Thr Ser Leu Val

161 405 410 415

162 Asp Thr Tyr Arg Phe Val Gln Ser Val Ala Ile Thr Cys Gln Lys Asp

163 420 425 430

164 Ala Ala Pro Ala Glu Asn Lys Asp Pro Tyr Asp Lys Leu Lys Phe Trp

165 435 440 445

166 Asn Val Asp Leu Lys Glu Lys Phe Ser Leu Asp Leu Asp Gln Tyr Pro

167 450 455 460

168 Leu Gly Arg Lys Phe Leu Val Gln Ala Gly Leu Arg Arg Lys Pro Thr

169 465 470 475 480

170 Ile Gly Pro Arg Lys Arg Ser Ala Pro Ser Ala Thr Thr Ser Ser Lys

171 485 490 495

172 Pro Ala Lys Arg Val Arg Val Arg Ala Arg Lys

173 500 505 
174 

175 (2) INFORMATION FOR SEQ ID NO: 3: 
176 

177 (i) SEQUENCE CHARACTERISTICS: 

178 (A) LENGTH: 1389 base pairs 

179 (B) TYPE: nucleic acid 

180 (C) STRANDEDNESS: single

181 (D) TOPOLOGY: linear 
182 

183 (ii) MOLECULE TYPE: cDNA

184 (iii) HYPOTHETICAL: NO 

185 (iv) ANTI-SENSE: NO 

186 (v) FRAGMENT TYPE:

187 (vi) ORIGINAL SOURCE: 
188 

189 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
190 

191 ATGGTATCCC ACCGTGCCGC ACGACGCAAA CGGGCTTCGG TGACTGACTT ATATAAAACA 60 

192 TGTAAACAAT CTGGTACATG TCCATCTGAT GTTGTTAATA AGGTAGAGGG CACCACGTTA 120 

193 GCAGATAAAA TATTGCAATG GTCAAGCCTT GGTATATTTT TGGGTGGACT TGGCATAGGT 180 

194 ACTGGAAGTG GTACAGGGGG TCGTACAGGG TACATTCCAT TGGGTGGGCG TTCCAATACA 240 

195 GTTGTGGATG TCGGTCCTAC ACGTCCTCCA GTGGTTATTG AACCTGTGGG CCCCACAGAC 300 

196 CCATCTATTG TTACATTAAT AGAGGACTCA AGTGTTGTTA CATCAGGTGC ACCTAGGCCT 360 

197 ACTTTTACTG GCACGTCTGG GTTTGATATA ACATCTGCTG GTACAACTAC ACCTGCAGTT 420

198 TTGGATATCA CACCTTCGTC TACCTCTGTT TCTATTTCCA CAACCAATTT TACCAATCCT 480 

199 GCATTTTCTG ATCCGTCCAT TATTGAAGTT CCACAAACTG GGGAGGTGTC AGGTAATGTA 540 

200 TTTGTTGGTA CCCCTACATC TGGAACACAT GGGTATGAAG AAATACCTTT ACAAACATTT 600 

201 GCTTCTTCTG GTACGGGGGA GGAACCCATT AGTAGTACCC CATTGCCTAC TGTGCGGCGT 660 

202 GTAGCAGGTC CCCGCCTTTA CAGTAGGGCC TACCAACAAG TGTCTGTGGC TAACCCTGAG 720 

203 TTTCTTACAC GTCCATCCTC TTTAATTACC TATGACAACC CGGCCTTTGA GCCTGTGGAC 780 

204 ACTACATTAA CATTTGAGCC TCGTAGTAAT GTTCCTGATT CAGATTTTAT GGATATTATC 840

205 CGTTTACATA GGCCTGCTTT AACATCCAGG CGTGGTACTG TGCGCTTTAG TAGATTAGGT 900 



206 CAAAGGGCAA CTATGTTTAC CCGTAGCGGT ACACAAATAG GTGCTAGGGT TCACTTTTAT 960 

207 CATGATATAA GTCCTATTGC ACCCTCCCCA GAATATATTG AACTGCAGCC TTTAGTATCT 1020 

208 GCCACGGAGG ACAATGGCTT GTTTGATATA TATGCAGATG ACATAGACCC TGCAATGCCT 1080 

209 GTACCATCGC GTCCTACTAC CTCCTCTGCA GTTTCTACAT ATTCGCCCAC TATATCATCT 1140 

210 GCCTCTTCCT ATAGTAATGT AACGGTCCCT TTAACCTCCT CTTGGGATGT GCCTGTATAC 1200 

211 ACGGGTCCTG ATATTACATT ACCACCTACT ACCTCTGTAT GGCCCATTGT ATCACCCACA 1260 

212 GCCCCTGCCT CTACACAGTA TATTGGTATA CATGGTACAC ATTATTATTT GTGGCCATTA 1320 

213 TATTATTTTA TTCCTAAAAA GCGTAAACGT GTTCCCTATT TTTTTGCAGA TGGCTTTGTG 1380 

214 GCGGCCTAG 1389 
215 

216 (2) INFORMATION FOR SEQ ID NO: 4: 
217 

218 (i) SEQUENCE CHARACTERISTICS: 

219 (A) LENGTH: 461 amino acids 

220 (B) TYPE: amino acid 

221 (C) STRANDEDNESS: single

222 (D) TOPOLOGY: linear 
223 

224 (ii) MOLECULE TYPE: peptide 

225 (iii) HYPOTHETICAL: NO 

226 (iv) ANTI-SENSE: NO 

227 (v) FRAGMENT TYPE: N-terminal 
228 (vi) ORIGINAL SOURCE:

229 

230 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
231 

232 Met Val Ser His Arg Ala Ala Arg Arg Lys Arg Ala Ser Val Thr Asp

233 1 5 10 15

234 Leu Tyr Lys Thr Cys Lys Gln Ser Gly Thr Cys Pro Ser Asp Val Val

235 20 25 30

236 Asn Lys Val Glu Gly Thr Thr Leu Ala Asp Lys Ile Leu Gln Trp Ser

237 35 40 45

238 Ser Leu Gly Ile Phe Leu Gly Gly Leu Gly Ile Gly Thr Gly Ser Gly

239 50 55 60

240 Thr Gly Gly Arg Thr Gly Tyr Ile Pro Leu Gly Gly Arg Ser Asn Thr

241 65 70 75 80

242 Val Val Asp Val Gly Pro Thr Arg Pro Pro Val Val Ile Glu Pro Val

243 85 90 95

244 Gly Pro Thr Asp Pro Ser Ile Val Thr Leu Ile Glu Asp Ser Ser Val

245 100 105 110

246 Val Thr Ser Gly Ala Pro Arg Pro Thr Phe Thr Gly Thr Ser Gly Phe

247 115 120 125

248 Asp Ile Thr Ser Ala Gly Thr Thr Thr Pro Ala Val Leu Asp Ile Thr

249 130 135 140

250 Pro Ser Ser Thr Ser Val Ser Ile Ser Thr Thr Asn Phe Thr Asn Pro

251 145 150 155 160

252 Ala Phe Ser Asp Pro Ser Ile Ile Glu Val Pro Gln Thr Gly Glu Val

253 165 170 175

254 Ser Gly Asn Val Phe Val Gly Thr Pro Thr Ser Gly Thr His Gly Tyr

255 180 185 190

256 Glu Glu Ile Pro Leu Gln Thr Phe Ala Ser Ser Gly Thr Gly Glu Glu

257 195 200 205

258 Pro Ile Ser Ser Thr Pro Leu Pro Thr Val Arg Arg Val Ala Gly Pro



SEQUENCE VERIFICATION REPORT 

PATENT APPLICATION US/08/409,122

DATE: 01/30/97
TIME: 09:39:25

INPUT SET: S15202.raw



Original Text 



